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ABSTRACT 

It has long been known that stars with high metallicity are more likely to host giant planets than 
stars with low metallicity. Yet the connection between host star metallicity and the properties of 
small planets is only just beginning to be investigated. It has recently been argued that the metallicity 
distribution of stars with exoplanet candidates identified by Kepler provides evidence for three distinct 
clusters of exoplanets, distinguished by planet radius boundaries at 1.7and 3.9i?0. This would 
suggest that there are three distinct planet formation pathways for super-Earths, mini-Neptunes, and 
giant planets. However, as I show through three independent analyses, there is actually no evidence for 
the proposed radius boundary at 1.7i?0. On the other hand, a more rigorous calculation demonstrates 
that a single, continuous relationship between planet radius and metallicity is a better fit to the data. 
The planet radius and metallicity data therefore provides no evidence for distinct categories of small 
planets. This suggests that the planet formation process in a typical protoplanetary disk produces a 
continuum of planet sizes between 1 i?0 and 4i?0. As a result, the currently available planet radius 
and metallicity data for solar-metallicity F and G stars give no reason to expect that the amount of 
solid material in a protoplanetary disk determines whether super-Earths or mini-Neptunes are formed. 

Keywords: methods: statistical — planetary systems — planets and satellites: formation — stars: 
statistics 


1. INTRODUCTION 

The probability that a giant planet orbits a star is a 
steeply rising function of the host star’s me tallicity (e.g., 
Santos et al.f|2004' Fischer fc Valenti||2005 ). This obser- 
vation is the key piece of evidence that the giant planets 
identified by the radial velocity and transit techniques 
form through core accretion and not through gravita¬ 
tional instability. This observation is perhaps the most 
important constraint placed on models of planet forma¬ 
tion since the discovery of the first exoplanets. 

The connection between stellar metallicity and the 
presence of small planets is less clear. The Neptune-mass 
planets discovered by radial velocity surveys do not ap- 
pear to preferentially orbit metal-rich FGK stars (e.g., 
Sousa et al. 2008 Mayor et al. 2011). While Kepler 
has discovered a large number of small exoplanet can- 
didat es (planets from here), i t has not yet settled the 
issue. 


small-planet and large-planet subsamples for different 
choices of the radius boundary dividing the two sub¬ 
samples. They calculated the p-value from a two-sample 
Kolmogorov-Smirnov test on the two subsamples as a 
function of planet radius and identified local minima p- 
values. They saved the radii at which the local min¬ 
ima occurred. To account for measurement uncertain¬ 
ties, they repeated this process 10® times, sampling the 
planet radius and host star metallicity from their uncer¬ 
tainty distributions on each iteration. They identified a 
distinct p-value minimum at Rp = 1.7 i?0 and argued 
that it represents a boundary between terrestrial and 
“gas dwarf’ planets. This approach is inappropriate be¬ 
cause it performs a large number of hypothesis tests on 
the same data set without correcting the test thresholds 
to account for the large number of tests. That strat- 
egy is known to ir roduce a high false-discovery rate (e.g., 


Schlaufman fc LaughiinI (IM showed that while |Dunn|[l^ [M Moreover, the B14 technique creates 


the giant planets discovered by Kepler orbit metal-rich 
stars, the small planets discovered around F and G stars 


tion was later confirmed by 

Buchhave et al. 

(2012 

1 . 

Recently, 

Buchhave et al. 

(21)14) (B14 from here) ar- 

gued that the observed dis1 

nibution of metallicity in a 


sample of more than 400 Kepler planet host stars re¬ 
vealed three distinct clusters of exoplanets: terrestrial 
planets with planet radius Rp < 1.7 i?0, “gas dwarf’ 
planets with 1.7 i?0 ^ Rp ^ 3.9 i?0, and ice or gas gi¬ 
ants with Rp > 3.9 i?0. They suggested that these three 
populations formed via distinct planet formation chan¬ 
nels. 

To reach that conclusion, B14 repeatedly split their 
sample of planet host star metallicity measurements into 

^ Kavli Fellow. 


a sequence of p-values at many split points for data sub¬ 
ject to measurement uncertainty. Consequently, before 
attaching any significance to features in that sequence of 
p-values, it is also critical to ensure that the p-values that 
result from the Monte Carlo simulation accurately rep¬ 
resent the p-value measurement uncertainties that result 
from uncertainties in the input sample. 

There are at least four more problems with the analy¬ 
sis presented in B14. First, B14 overlooked the effect 
of planet radius uncertainty due to transit depth un¬ 
certainty. Second, their approach used an asymptoti¬ 
cally inconsistent estimator of the average p-value at each 
split point in the presence of observational uncertainty. 
Third, their analysis is subject to the multiple compar¬ 
isons problem, which reduces the significance of their ob¬ 
servation by a large amount. Fourth, while B14 assert 
that local minima in a plot of p-value as a function of 
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split radius indicate transitions between distinct clusters 
of exoplanets, this is not necessarily so. I describe my 
sample selection in Section 2, I detail each issue with the 
B14 calculation in Section 3, I outline a more rigorous 
way to investigate the issue in Section 4, and I discuss 
the implications and my conclusion in Section 5. 

2. SAMPLE CONSTRUCTION 

I use the planet host star data from B14. Those data 
include Teff,logg, [M/H], M*, i?*, and their associated 
uncertainties. B14 did not include in their planet radius 
uncertainties the effect of uncertainties in transit depth, 
even though transit depth uncertainties are more impor¬ 
tant than the host star radius uncertainties in 25% of the 
sample. As a result, I supplement the B14 data with the 
latest Kepler object of interest period and Rp/R^, esti¬ 
mates from the Kepler CasJobs databas^ hosted by the 
Mikulski Archive for Space Telescopes. I then recompute 
planet radii from the B14 stellar radii and the updated 
transit depths. Following B14, I remove from the sample 
all planets smaller than Si?® subject to strong stellar 
irradiation (i.e., F;, > 5 x 10® J s“^ as these plan¬ 

ets may have lost a significant fraction of their initial 
atmospheres. I plot these data in Figure 

3. ISSUES WITH THE BUCHHAVE ET AL. (2014) 
CALCULATION 

3.1. An Asymptotically Inconsistent p-value Estimator 

An asymptotically inconsistent estimator of a parame¬ 
ter does not converge to the true value of the parameter 
in the large-sample limit. One problem with the B14 
analysis is that they used an asymptotically inconsistent 
estimator of the p-value averaged over planet radius mea¬ 
surement uncertainty in their Monte Carlo simulation. 
The p-value measurements depend on the planet radius 
measurements, which are subject to measurement uncer¬ 
tainty in the inferred stellar radii i?* and the measured 
ratios i?p/i?*. The true p-values in the absence of un¬ 
certainty cannot be measured directly. Instead, one can 
only measure p' 


p'=p-p7V(0,cr^) , 

( 1 ) 

where p is the true p-value and Af(0, cr^) is due to mea¬ 
surement uncertainties in i?* and Rp/R^,. Repeatedly 
calculating p' after perturbing each planet radius due to 
the uncertainties in i?* and Rp/Ri, and averaging the re¬ 
sult will provide an asymptotically consistent estimate of 
p by the central limit theorem 

F[p']=F[p + fV(0,a2)] , 

(2) 

F[p'] = F[p]+F[iV(0,a")] , 

( 3 ) 

= + -^7V(0,a^ 

i—l i—1 i—1 

( 4 ) 

p' = p -|- 0 ^ p' = p. 

( 5 ) 


B14 never averaged the p-value produced for each split 
point over all iterations. Instead, after each iteration 
of their Monte Carlo simulation they identified the local 
p-value minima and saved them. After completing 10® 

^ http://mastweb.stsci.edu/kplrcasjobs/ 


Monte Carlo iterations, they determined the mean radii 
at which local p-value minima occurred by averaging over 
the individual radii calculated on each iteration. In other 
words, they applied the nonlinear function / that takes 
a sequence of p-values and identifies the radii of local 
p-value minima before averaging over all iterations to 
identify the mean radii at which p-value minima occur. 
The central limit theorem does not apply in this case, as 

fip') = f{p + N{0,<j'^)), (6) 

E[fip')]=E[f{p + N{0,a^))], (7) 

^ n 1 ^ 

- X]/(/) = -I]/(P +^(0,^2)), (8) 

fip') = fiP + N{0,a^)) ^ f{p') = f{p). (9) 

As a result, the p-values in their Figure I improperly 
account for measurement uncertainty and are asymptot¬ 
ically inconsistent with the true p-values absent measure¬ 
ment uncertainty. 

To address that problem, I first generate 10® realiza¬ 
tions of each planet radius from the distributions that re¬ 
sult from the propagation of measurement uncertainties 
in i?* and i?j,/i?*. I split the metallicity data into small- 
planet and large-planet subsamples at 321 split points 
from 0.3 to 13.1 i?® in steps of 0.04 i?® and compute the 
p-value from a two-sample Kolmogorov-Smirnov test. I 
save the resulting p-value for each split point and repeat 
this process 10® times. At the end of the calculation, I 
average the p-values for each split point. I plot the re¬ 
sult in Figure The apparent local minimum in the 
p-value distribution identified by B14 at Rp = 1.7 i?® is 
not present. 


3.2. The Multiple Comparisons Problem 

Another issue involves the multiple comparisons prob¬ 
lem. The multiple comparisons problem occurs in sta¬ 
tistical analyses when the same data is both us ed to 
select a mo del and estimate its parameters (e.g., |Ben-j 
jamini|2010 ). It frequently leads to the underestimate of 
the uncertainty of the model parameters. In this case, 
B14 used the same metallicity data to both identify the 
planet radius boundaries that separated the three dis¬ 
tinct clusters and to estimate the mean metallicity and 
associated uncertainty for each cluster. Since they used 
their data both to set the boundaries and determine the 
mean metallicities for each region, their analysis is sub¬ 
ject to the multiple comparisons problem. 

One way to correct for this problem is to use indepen¬ 
dent data sets, one to select the model and another to fit 
the model parameters. In this case, the correct approach 
is to split the metallicity data in half. The first half 
should be used to identify the planet radius boundaries 
that separate the three distinct clusters of exoplanets. 
The second half should then be used to infer the aver¬ 
age metallicity of each proposed cluster. This process 
can be repeated a large number of times with different 
randomly selected subsamples. Consequently, on each 
iteration of a Monte Carlo simulation, I split the data 
set described in Section 2 in half. I follow the approach 
of B14 and identify p-value minima at Rp < 2i?® and 
2 i?® < i?p < 4 i?®. I use those planet radii as the bound¬ 
aries of each exoplanet cluster and use the second half of 
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the metallicity data to compute the mean metallicity of 
each cluster. I repeat this process 10^ times. I find that 
the difference between the mean metallicities for the ter¬ 
restrial and “gas dwarf’ regions is only 0.7cr—much lower 
than the 3.1 ct offset reported by B14. 

Metal-poor stars are smaller than metal-rich stars, so 
a bias toward finding small planets around metal-poor 
stars in a transit-depth-limited survey is a systematic 
effect that will further decrea se the significance of this 
offset (Gaidos & Mann 20131. The fact that the mean 
metallicities of the stars on either side of the claimed 
transition at Rp = 1.7 i?0 are indistinguishable contra¬ 
dicts the B14 interpretation of the transition as evidence 
of different planet formation pathways. 


3.3. Do Local p-value Minima Indicate Distinct 
Exoplanet Regimes? 

While B14 argue that local minima in a plot of split 
radius versus p-value indicate transitions between dis¬ 
tinct exoplanet clusters, this is not always the case. To 
demonstrate this, I use the same Monte Carlo simula¬ 
tion described in Section 3.1. However, instead of using 
the observed metallicities, on each iteration I randomly 
sample the metallicities of stars hosting planets with 
Rp ^ 1.7 770 , 1.7 770 < Rp ^ 3.9 770 , and Rp ^ 3.9 770 
from their observed distributions. Those distributions 
are ^(0.00,0.20^), A(0.05,0.lfi^), and A(0.18,0.W^). I 
plot the result in Figure]^ Despite the fact that distinct 
metallicity distributions were imposed on each planet 
cluster, there is no local minimum in the p-value distri¬ 
bution at the boundary between the terrestrial and “gas 
dwarf’ planets. The inability of the B14 technique to 
identify a metallicity boundary imposed by construction 
as a local p-value minimum implies that the technique is 
not sensitive to subtle features in the metallicity distri¬ 
bution. 


4. A MORE RIGOROUS APPROACH 

A better way to identify the number of subpopulations 
required by the planet radius and host star metallicity 
data is to compare statistical models with varying num¬ 
bers of components, then identify the model that has the 
minimum number of parameters yet the maximum like¬ 
lihood of producing the observed data. I consider two 
classes of models. First, I fit single-population linear 
models of the form 

m 

[M/H]=ao + ^a,77^ + e, (10) 

7=1 

where e is the standard uncertainty term in the regression 
equation. Second, I fit finite Gaussian mixture models 
with varying numbers of subpopulations of the form 

n m 

( 11 ) 

i=l 7=1 

where x is the data, m is the number of components in 
the model, the wj are weights such that 
and each Nj is a two-dimensional Gaussian component 
of the overall density 


exp{-^(x, i(x, -pj)|(12) 

Here fXj and Sj are the mean and covariance of each 


mixture models 

using the mclustr 

package in fFI 

Fraley 

& Raftery||2002 

Fraley et al.||2012 

R Gore Team 

2014). 


a Monte Carlo simulation. I sample the planet radii 
from the distributions that result from the propagation 
of measurement uncertainties in 77* and 77p/77* and di¬ 
rectly use the measured metallicities (since the uncer¬ 
tainty in [M/H] is already reflected in the uncertainty in 
77*). On each iteration, I fit linear models of the form of 
Equation (10) for m = 1, 2,..., 5 and Gaussian mixture 
models m=l,2,...,7. I choose both the best linear and 
Gaussian mixt ure models usin g the Bayesian information 


criterion (BIG; Schwarz 

1978 

) , then use the Akaike infer- 

mation criterion (AlC; 

Akai 

ke 

1974 

) to choose between 


peat this process 10^ times. In all cases, the best linear 
model is preferred. For the linear model, the m = 1 
model is favored 76.4% of the time, the m = 2 model is 
favored 22.3% of the time, while a higher-order model is 
favored 1.3% of the time. While the Gaussian mixture 
model is disfavored relative to the linear model, the two- 
component model is the best of the mixture models: the 
two-component model is preferred on 92.8% of the itera¬ 
tions, while the three-component model is preferred 7.2% 
of the time. I plot representative examples of the BIC- 
selected models from one iteration of my Monte Carlo 
simulation in Figure]^ 

5. DISCUSSION AND CONCLUSION 

The performance of a large number of tests on the same 
data set without correcting the test thresholds, the use of 
an asymptotically inconsistent estimator of the p-value in 
the presence of measurement uncertainty, the oversight 
of the multiple comparison problem, or the inability of 
the B14 technique to identify an imposed metallicity ef¬ 
fect as a p-value minimum are all sufficient reasons to 
be skeptical of the claimed transition at Rp = 1.7770. 
The problems with the B14 analysis technique cannot 
be mitigated by examining a larger or independent data 
set—they are inherent in the analysis technique itself. 
Instead, the analysis in Section 4 shows that a smooth, 
one-component linear model is a better fit to the data 
than any multi-component model. If a multi-component 
model is used, then the two-component model is consis¬ 
tently a better fit than the three-component model. As 
a result, the planet radius and metallicity data for the 
Kepler F and G star planet hosts does not support the 
idea of multiple types of small planets. Instead, a con¬ 
tinuum of planet sizes between 1770 and 4 7?0 are likely 
formed independent of the amount of solids present. 

While observational evidence suggests that most plan¬ 
ets larger than about 2 770 have sig nificant hydrogen at¬ 
mospheres (e.g., Marcy et al. 2014), Kepler-lOc is an ex- 
ceptio n with a radius ot 2.35 77m and a density of 7.1 g 
cm“^ (Dumusque et al.||2014|. Likewise, smaller planets 
probably have a wide range of atmospheric properties 


® http://www.stat.washington.edu/mclust/ 
^ http://www.R-project.org/ 
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(e.g., Rogers|[2014 Wolfgang fc Lopez||2014 1. Moreover, 
a wide range of densities can be present eve n in the same 


syste m, with Kepler-36 the best example (Carter et al. 


20121. For these reasons, near solar metallicity it does 


not seem likely that the final masses or compositions of 
small exoplanets are controlled primarily by the amount 
of solid material present in their parent protoplanetary 
disks. 
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Planet radius [REarth] 


Figure 1. Planet radius Rp vs. host star metallicity. 
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Figure 2. Mean p-value as a function of planet radius. I split the sample into small-planet and large-planet subsamples at 321 split points 
from 0.3 to 13.1in steps of 0.04 Rq and compute the p-value from a two-sample Kolmogorov-Smirnov tests on the metallicity 
distributions of both subsamples. I repeat this process 10^ times. The black points are mean p-values averaged over the uncertainties in 
host star radius R* and transit depth (Rp/R*)^. I indicate the uncertainty at each radius as a semi-transparent gray rectangle with height 
given by the uncertainty in the p-value and width 0.02R^. After accounting for the uncertainties in R* and (Rp/R*)^, the p-values do 
not support the idea of a qualitative difference between planets with radii above or below 1.7 Rq. 
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Figure 3. Mean p-value as a function of planet radius in the scenario advocated by B14 for three distinct planet clusters separated at 
1.7i?0 and 3.9i?0 with metallicity distributions W(0.00, 0.20^), A^(0.05, 0.19^), and A^(0.18, 0.19^). If planet radius boundaries at 1.7 
and 3.9 do separate the exoplanet population into three clusters with unique metallicity distributions, then that difference would 
manifest itself as a non-continuous first derivative—a “kink”—at 1.7 R^. Even though three unique metallicity distribution were imposed 
by construction in this case, there is no p-value minimum at 1.7 R^. Consequently, even if there were three distinct clusters of exoplanets, 
each with a unique metallicity distribution, the analysis described in B14 would not be able to identify them by p-value minima. 




SCHLAUFMAN 


0.4 


0.2 

^ 0.0 

o 

- 0.2 


- 0.4 


0 5 10 15 

Planet radius [REarth] 

Figure 4. Two different models for the relationship between Rp and metallicity. Left: a one-component, linear relationship between 
Rp and metallicity. The blue line shows the best-fit model, while the green-shaded region shows the 25% and 75% quantile regression 
bands computed using the quantreg package ||Koenker||2013||. Right: a Gaussian mixture model with two components. Planets plotted 
as blue circles are assigned to one component, wtiile gray squares are assigned to the other. The divide between the two populations 
occurs at Rp ^ 4i70. While the two-component Gaussian mixture model is favored over mixture models with one to seven components, 
the one-component linear model is favored by the Akaike information criterion (AIC) over any of the mixture models. For that reason, a 
one-component smooth model is currently the best match to the Kepler planet radius and metallicity data for F and G stars presented in 
B14. 
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